Skip to content

Automated Test: embed-url-handling-post #311

Closed

Conversation

admin-coderabbit
Copy link
Owner

@admin-coderabbit admin-coderabbit commented Feb 4, 2026

This pull request was automatically created by @coderabbitai/e2e-reviewer.

Batch created pull request.

Summary by CodeRabbit

Release Notes

  • New Features

    • RSS/Atom feed polling automatically imports external feed content
    • Forum discussions can now be embedded on external websites
    • Topics can be imported from remote URLs with automatic content parsing
  • Configuration

    • New site settings for embedding host configuration, feed polling, and content authorship

@coderabbit-eval
Copy link

coderabbit-eval bot commented Feb 4, 2026

📝 Walkthrough

Walkthrough

This PR implements a complete embedding system for Discourse that enables external websites to embed discussion topics. It introduces a new TopicEmbed model for tracking embedded content, an EmbedController for rendering embedded discussions, background jobs for feed polling and topic retrieval, client-side assets for embedding iframes, and comprehensive configuration infrastructure with localization support.

Changes

Cohort / File(s) Summary
Core Embedding Model
app/models/topic_embed.rb, db/migrate/20131217174004_create_topic_embeds.rb
Introduces TopicEmbed model with import logic to create/update posts from external URLs, feed integration via SimpleRSS, HTML content processing with Nokogiri, and database schema for tracking embedded topics.
Post Cooking Enhancement
app/models/post.rb, db/migrate/20131219203905_add_cook_method_to_posts.rb
Adds cook_method enum to posts table enabling raw_html cooking mode, bypassing standard post rendering pipeline for RSS-imported content.
Embed Controller & Views
app/controllers/embed_controller.rb, app/views/embed/best.html.erb, app/views/embed/loading.html.erb, app/views/layouts/embed.html.erb
Creates EmbedController#best action with embeddability validation, X-Frame-Options header management, and three view templates for displaying embedded topics, loading state, and iframe communication.
Background Jobs
app/jobs/regular/retrieve_topic.rb, app/jobs/scheduled/poll_feed.rb
Introduces RetrieveTopic job for async topic fetching and PollFeed scheduled job for hourly RSS/Atom feed polling with TopicEmbed integration.
Topic Retrieval Library
lib/topic_retriever.rb
Encapsulates topic embed retrieval with host validation, throttling via Redis, and delegation to TopicEmbed.import_remote with feed polling support.
Client-side Assets
app/assets/javascripts/embed.js, app/assets/stylesheets/embed.css.scss
Adds embed iframe initialization script with postMessage-based resize handling and styling for embedded post display with headers, footers, and author information.
Configuration & Localization
config/site_settings.yml, config/locales/client.en.yml, config/locales/server.en.yml, config/routes.rb
Defines embedding site settings (embeddable_host, feed_polling_enabled, feed_polling_url, embed_by_username), translation keys for embed UI, and GET /embed/best route.
Dependencies & Support
Gemfile, lib/post_creator.rb, lib/post_revisor.rb, lib/tasks/disqus.thor
Adds ruby-readability and simple-rss gems, enables cook_method passthrough in PostCreator, adds skip_validations flag to PostRevisor, and migrates Disqus import to use TopicEmbed.import_remote.
Database Infrastructure
db/migrate/20131210181901_migrate_word_counts.rb, db/migrate/20131223171005_create_top_topics.rb
Formatting change to word counts migration; modifies top_topics table creation to use force: true flag.
Comprehensive Test Coverage
spec/components/topic_retriever_spec.rb, spec/controllers/embed_controller_spec.rb, spec/jobs/poll_feed_spec.rb, spec/models/topic_embed_spec.rb
Adds test suites validating TopicRetriever host/throttle logic, EmbedController access controls and topic resolution, PollFeed execution conditions, and TopicEmbed import workflow with URL handling.

Sequence Diagram(s)

sequenceDiagram
    participant Browser
    participant EmbedController
    participant TopicEmbed
    participant RetrieveTopic as RetrieveTopic Job
    participant TopicRetriever
    participant Database

    Browser->>EmbedController: GET /embed/best?embed_url=...
    EmbedController->>EmbedController: validate embeddable_host & referer
    EmbedController->>TopicEmbed: topic_id_for_embed(url)
    alt Topic exists in database
        TopicEmbed-->>EmbedController: topic_id
        EmbedController->>Database: TopicView(topic_id, best: 5)
        Database-->>Browser: render best posts
    else Topic not embedded yet
        TopicEmbed-->>EmbedController: nil
        EmbedController->>RetrieveTopic: enqueue(embed_url)
        EmbedController-->>Browser: render loading view
        RetrieveTopic->>TopicRetriever: retrieve
        TopicRetriever->>TopicRetriever: validate host & throttle
        TopicRetriever->>TopicEmbed: import_remote(user, url)
        TopicEmbed->>Database: create post & topic_embed record
    end
Loading
sequenceDiagram
    participant Scheduler
    participant PollFeed as PollFeed Job
    participant SimpleRSS
    participant TopicEmbed
    participant PostCreator
    participant Database

    Scheduler->>PollFeed: execute (hourly)
    alt Settings enabled
        PollFeed->>SimpleRSS: parse feed from feed_polling_url
        SimpleRSS-->>PollFeed: RSS items
        loop For each feed item
            PollFeed->>PollFeed: extract url, title, content
            PollFeed->>TopicEmbed: import(user, url, title, content)
            alt First time importing URL
                TopicEmbed->>PostCreator: create post with raw_html method
                PostCreator->>Database: insert post & topic_embed
            else URL already imported
                TopicEmbed->>TopicEmbed: compute content_sha1
                alt Content changed
                    TopicEmbed->>Database: revise post content
                end
            end
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Hop along, dear friends, to embeds so bright,
External discussions, brought into light,
RSS feeds now poll through the day,
Topics retrieved in a clever new way,
With Readability parsing and postMessages too,
We've woven the web to embed just for you!

🚥 Pre-merge checks | ✅ 1 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title check ⚠️ Warning The title 'Automated Test: embed-url-handling-post' does not accurately reflect the main changes in this PR, which implement comprehensive embedding and RSS feed functionality. Use a more descriptive title that captures the core feature being added, such as 'Add embed and RSS feed polling functionality' or 'Implement topic embedding with feed import support'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch embed-url-handling-post

Comment @coderabbit eval help to get the list of available commands and usage tips.

@coderabbit-eval coderabbit-eval bot closed this Feb 4, 2026
Copy link

@coderabbit-eval coderabbit-eval bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

🤖 Fix all issues with AI agents
In `@app/assets/javascripts/embed.js`:
- Around line 5-12: The code assumes
document.getElementById('discourse-comments') always returns an element and that
discourseUrl ends with a slash; add a null-guard and normalize discourseUrl:
fetch the container via document.getElementById('discourse-comments') and if it
is null, no-op (or bail out) before creating/appending the iframe, and ensure
discourseUrl ends with '/' (append one if missing) before building iframe.src
using discourseEmbedUrl; update references:
document.getElementById('discourse-comments'), comments,
comments.appendChild(iframe), iframe.src, discourseUrl, and discourseEmbedUrl.
- Around line 15-24: The origin check in postMessageReceived is insecure because
it uses discourseUrl.indexOf(e.origin); replace that with a strict origin
comparison (e.origin === discourseUrl) or compare against a whitelist of allowed
origins, ensuring you reference the same discourseUrl and e.origin variables;
update the conditional in the postMessageReceived function to only accept exact
matches before processing e.data and setting iframe.height.

In `@app/jobs/scheduled/poll_feed.rb`:
- Around line 31-37: i.content can be nil causing a NoMethodError when calling
.scrub; update the RSS item handling in the loop so you normalize the item body
before scrubbing/using it — e.g. derive a string body by checking i.content,
then fallback to i.content_encoded or i.description (or empty string), call
.to_s.scrub on that result, and pass the cleaned string into TopicEmbed.import
along with url and i.title; ensure the nil-guarding happens before calling
.scrub and reference the loop handling where i.content is accessed and the
TopicEmbed.import call.
- Line 29: Replace the unsafe Kernel#open call by validating and fetching the
feed URL using URI and a direct HTTP client: parse SiteSetting.feed_polling_url
with URI.parse, ensure the scheme is "http" or "https", fetch the response body
with Net::HTTP (or a similar HTTP client) and then pass that body string to
SimpleRSS.parse; update the code around SimpleRSS.parse and
SiteSetting.feed_polling_url to perform the scheme validation and HTTP fetch and
add basic error handling for invalid URIs or network failures.

In `@app/models/topic_embed.rb`:
- Around line 44-53: The import_remote method uses open(url).read which is
unsafe (SSRF, no timeouts/size limits); replace it with a validated HTTP fetch
that only allows http/https, enforces open and read timeouts, and caps response
size before passing content to Readability::Document; update TopicEmbed.import
call to handle fetch errors (timeout, non-2xx, oversized body) by rescuing and
returning/logging a sensible failure, and reference the import_remote method and
Readability::Document when making these changes.
- Line 1: The file app/models/topic_embed.rb currently uses require_dependency
'nokogiri' which is incorrect for loading an external gem; replace the call to
require_dependency with a plain require 'nokogiri' so Nokogiri is loaded as a
gem (edit the top of topic_embed.rb where require_dependency 'nokogiri' appears
and change it to require 'nokogiri').
- Line 13: The code interpolates raw `url` into HTML via contents << ... which
risks XSS and also mutates the `contents` string; fix by HTML-escaping the `url`
before interpolation (e.g., use ERB::Util.html_escape or CGI.escapeHTML) and
avoid in-place mutation by building a new string (e.g., assign contents =
contents + ... or use a non-mutating concatenation) when calling
I18n.t('embed.imported_from', link: ...), so the link value is escaped and
`contents` is not mutated.

In `@app/views/embed/best.html.erb`:
- Line 16: The <img> tag rendering the avatar (src from
post.user.small_avatar_url) is missing an alt attribute; update the tag to
include a meaningful, escaped alt value such as the user's username (e.g., use
post.user.username or post.user.name) and provide a sensible fallback (empty
string) to ensure accessibility and avoid nil errors when the username is
missing.
- Line 6: The ERB template contains an invalid Ruby tag "end if" with no
condition; replace the dangling "end if" with a plain "end" ERB closing tag so
the conditional block is properly terminated (locate the "<%- end if %>"
occurrence in the template and change it to a simple "end" closing tag).

In `@app/views/embed/loading.html.erb`:
- Around line 6-11: The current IIFE uses setTimeout +
document.location.reload() with no stop condition, causing an unbounded reload
loop; modify the loader script to track retries (e.g., use a query param like
retry or a localStorage key such as "embed_retry_count"), increment it each
attempt inside the IIFE, and only call document.location.reload() if the count
is below a max (e.g., 5); when the max is reached, stop scheduling further
reloads and replace the page content with a friendly error message / retry
button so the embed shows an error state instead of continuously reloading.

In `@app/views/layouts/embed.html.erb`:
- Line 11: The embedded layout interpolates request.referer directly into
JavaScript (parent.postMessage) causing an XSS risk; change the interpolation to
escape the value using Rails' escape_javascript helper (use j(request.referer)
where request.referer is currently used in embed.html.erb) so the referer is
safely quoted, and add a server-side check before rendering this layout to
ensure request.referer (or derived origin) matches SiteSetting.embeddable_host
to restrict allowed target origins.

In `@db/migrate/20131217174004_create_topic_embeds.rb`:
- Line 3: The migration currently calls create_table :topic_embeds with force:
true which will silently drop the table if present; remove the force: true
option from the create_table call so the migration will raise an error if the
table already exists (preserving data) and run/update the migration tests to
ensure no further references expect destructive recreation of :topic_embeds.

In `@db/migrate/20131223171005_create_top_topics.rb`:
- Line 3: The migration currently uses create_table :top_topics, force: true
which will DROP the table if it exists and can cause silent data loss; remove
the force: true option from the create_table call and instead make the migration
idempotent by guarding the creation with a check for existence (e.g., use
table_exists?(:top_topics) or Rails' create_table ... if_not_exists option where
supported) or perform an explicit, intentional drop in the down/rollback path
only; update the migration that defines create_table :top_topics to use one of
these safer approaches so retries or replays won't destroy existing data.

In `@lib/topic_retriever.rb`:
- Around line 27-29: The current two-step lock (using
$redis.setnx(retrieved_key, "1") then $redis.expire(retrieved_key, 60)) is racy;
replace it with a single atomic Redis SET that applies NX and EX in one call
(use the Redis client SET with nx: true and ex: 60) so the key and TTL are
created atomically; adjust the conditional to check the SET return value (OK vs
nil) instead of setnx's truthiness and remove the separate expire call,
referencing retrieved_key, $redis.setnx and $redis.expire in the change.

In `@spec/controllers/embed_controller_spec.rb`:
- Around line 48-52: Create a concrete mock object for TopicView and make
TopicView.expects(:new).with(123, nil, {best: 5}) return that mock so the view
can call `@topic_view.posts` and `@topic_view.topic.url`; e.g., build a topic_view
mock, stub its posts to return an array (or an enumerable) and stub topic.url
(or topic to return a mock with url) before calling get :best so the template
can iterate and access nested attributes.

In `@spec/jobs/poll_feed_spec.rb`:
- Around line 12-28: The negative tests are not isolating the condition under
test because other preconditions may be falsy by default; update each `it` to
stub all other required SiteSetting values to valid values before calling
`poller.execute({})` so the test fails only when the targeted setting is
missing. Specifically, in the examples referencing `poller.execute`, ensure when
testing `feed_polling_url` you stub
`SiteSetting.stubs(:feed_polling_enabled?).returns(true)` and
`SiteSetting.stubs(:embed_by_username).returns("some_user")`; when testing
`embed_by_username` stub
`SiteSetting.stubs(:feed_polling_enabled?).returns(true)` and
`SiteSetting.stubs(:feed_polling_url).returns("http://example.com")`; and in the
`requires feed_polling_enabled?` case stub the other two settings to valid
values so each `poller.expects(:poll_feed).never` assertion only depends on the
single setting under test.
🧹 Nitpick comments (12)
Gemfile (1)

209-211: Consider pinning gem versions for reproducible builds.

Both ruby-readability and simple-rss are added without version constraints. While several existing gems in this Gemfile follow the same pattern, unpinned dependencies risk pulling in breaking changes on future bundle install runs. A pessimistic constraint (e.g., '~> x.y') would add safety without much maintenance overhead.

That said, the require: false usage is correct here since these are only needed in specific code paths (feed polling and remote content import).

lib/post_revisor.rb (1)

85-86: save return value is not checked — pre-existing but worth noting.

The addition of validate: !@opts[:skip_validations] is correct and safe (defaults to validate: true when the option is absent). However, @post.save(...) can return false on failure, and execution continues to save_reply_relationships on line 87 regardless. Compare with PostCreator#save_post which checks the result and raises ActiveRecord::Rollback.

This is pre-existing behavior and not a regression from this change, but if skip_validations broadens the set of callers to update_post, a silent save failure could be harder to diagnose.

app/models/topic_embed.rb (1)

56-76: Complexity is manageable but the <a> and <img> loops are duplicated.

The RuboCop hints flag ABC size and cyclomatic complexity. The two iteration blocks (lines 62-67 and 68-73) share identical logic — only the tag name and attribute name differ. Consider extracting a helper to DRY this up.

Proposed refactor
  def self.absolutize_urls(url, contents)
    uri = URI(url)
    prefix = "#{uri.scheme}://#{uri.host}"
    prefix << ":#{uri.port}" if uri.port != 80 && uri.port != 443

    fragment = Nokogiri::HTML.fragment(contents)
-   fragment.css('a').each do |a|
-     href = a['href']
-     if href.present? && href.start_with?('/')
-       a['href'] = "#{prefix}/#{href.sub(/^\/+/, '')}"
-     end
-   end
-   fragment.css('img').each do |a|
-     src = a['src']
-     if src.present? && src.start_with?('/')
-       a['src'] = "#{prefix}/#{src.sub(/^\/+/, '')}"
-     end
-   end
+   {'a' => 'href', 'img' => 'src'}.each do |tag, attr|
+     fragment.css(tag).each do |node|
+       val = node[attr]
+       if val.present? && val.start_with?('/')
+         node[attr] = "#{prefix}/#{val.sub(/^\/+/, '')}"
+       end
+     end
+   end

    fragment.to_html
  end
app/views/layouts/embed.html.erb (1)

1-20: No <meta charset> or Content-Type declaration in <head>.

Without an explicit charset (e.g., <meta charset="utf-8">), browsers may fall back to a legacy encoding, which can enable encoding-based XSS attacks on older browsers. Adding one is a low-effort hardening step for an externally-embeddable page.

Proposed fix
   <head>
+    <meta charset="utf-8">
     <%= stylesheet_link_tag 'embed' %>
spec/controllers/embed_controller_spec.rb (1)

35-38: Placing assertions in after blocks is fragile and can mask failures.

If an it block raises an exception, the after block still runs, and its assertion failure will replace the original error in the output. This makes debugging harder. Move the response assertions into each it block instead.

Proposed refactor
-      after do
-        response.should be_success
-        response.headers['X-Frame-Options'].should == "ALLOWALL"
-      end
-
       it "tells the topic retriever to work when no previous embed is found" do
         TopicEmbed.expects(:topic_id_for_embed).returns(nil)
         retriever = mock
         TopicRetriever.expects(:new).returns(retriever)
         retriever.expects(:retrieve)
         get :best, embed_url: embed_url
+        response.should be_success
+        response.headers['X-Frame-Options'].should == "ALLOWALL"
       end

       it "creates a topic view when a topic_id is found" do
         TopicEmbed.expects(:topic_id_for_embed).returns(123)
         TopicView.expects(:new).with(123, nil, {best: 5})
         get :best, embed_url: embed_url
+        response.should be_success
+        response.headers['X-Frame-Options'].should == "ALLOWALL"
       end
spec/jobs/poll_feed_spec.rb (1)

2-2: Unrelated require_dependency — this file tests PollFeed, not ProcessPost.

This appears to be a copy-paste artifact. Consider replacing it with the correct dependency:

Proposed fix
-require_dependency 'jobs/regular/process_post'
+require_dependency 'jobs/scheduled/poll_feed'
app/jobs/regular/retrieve_topic.rb (1)

1-1: Unnecessary require_dependencyemail/sender is not used in this job.

This appears to be a copy-paste artifact. Only topic_retriever is needed.

Proposed fix
-require_dependency 'email/sender'
 require_dependency 'topic_retriever'
spec/models/topic_embed_spec.rb (2)

26-37: Weak assertion pattern — unhelpful failure messages.

Line 34 uses x.start_with?(...).should be_true. If this fails, the error message will just say expected true, got false with no indication of the actual value. Prefer a matcher that surfaces the diff.

Proposed improvement
-        post.cooked.start_with?("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">").should be_true
+        post.cooked.should start_with("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">")

39-42: Update test doesn't verify content_sha1 was changed.

The AI summary mentions that re-importing with different content should update embed.content_sha1. This test only checks that the cooked content changed but doesn't assert that the underlying TopicEmbed record's content_sha1 was actually updated.

app/jobs/scheduled/poll_feed.rb (1)

20-22: feed_key is defined but never used.

This method computes a cache key but is never referenced. If it's intended for future use, consider removing it to avoid dead code, or wire it into the polling logic for change detection.

app/controllers/embed_controller.rb (1)

24-31: Strict referer host comparison may reject legitimate subdomains.

Line 26 performs an exact host comparison: URI(request.referer).host != SiteSetting.embeddable_host. If the admin sets embeddable_host to example.com, embeds from www.example.com will be rejected. Consider whether a subdomain-aware check (e.g., end_with?) is needed, or document this restriction clearly.

lib/topic_retriever.rb (1)

40-43: Synchronous full feed poll is heavy for a single URL retrieval.

Jobs::PollFeed.new.execute({}) processes the entire RSS feed inline just to find one embed URL. This works but is inefficient — every missing embed triggers a full feed import. Consider whether a targeted lookup or async enqueue would be more appropriate.

Comment on lines +5 to +12
var comments = document.getElementById('discourse-comments'),
iframe = document.createElement('iframe');
iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
iframe.id = 'discourse-embed-frame';
iframe.width = "100%";
iframe.frameBorder = "0";
iframe.scrolling = "no";
comments.appendChild(iframe);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing null-guard on discourse-comments element and no trailing-slash normalization on discourseUrl.

Two defensive concerns:

  1. If document.getElementById('discourse-comments') returns null, comments.appendChild(iframe) throws a TypeError. A guard check avoids a hard failure on pages that don't include the expected container.

  2. The URL concatenation discourseUrl + "embed/best?..." assumes discourseUrl ends with /. If it doesn't (e.g., "http://example.com"), the resulting URL is "http://example.comembed/best?...".

Suggested fix
- var comments = document.getElementById('discourse-comments'),
-     iframe = document.createElement('iframe');
- iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
+ var comments = document.getElementById('discourse-comments'),
+     iframe = document.createElement('iframe'),
+     baseUrl = discourseUrl.replace(/\/?$/, '/');
+ if (!comments) { return; }
+ iframe.src = baseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
🤖 Prompt for AI Agents
In `@app/assets/javascripts/embed.js` around lines 5 - 12, The code assumes
document.getElementById('discourse-comments') always returns an element and that
discourseUrl ends with a slash; add a null-guard and normalize discourseUrl:
fetch the container via document.getElementById('discourse-comments') and if it
is null, no-op (or bail out) before creating/appending the iframe, and ensure
discourseUrl ends with '/' (append one if missing) before building iframe.src
using discourseEmbedUrl; update references:
document.getElementById('discourse-comments'), comments,
comments.appendChild(iframe), iframe.src, discourseUrl, and discourseEmbedUrl.

Comment on lines +15 to +24
function postMessageReceived(e) {
if (!e) { return; }
if (discourseUrl.indexOf(e.origin) === -1) { return; }

if (e.data) {
if (e.data.type === 'discourse-resize' && e.data.height) {
iframe.height = e.data.height + "px";
}
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Origin validation via indexOf is insecure — allows prefix-matching bypass.

discourseUrl.indexOf(e.origin) is a substring check: any origin that is a prefix of discourseUrl will pass. For example, if discourseUrl is "http://discourse.example.com/", a malicious page at http://discourse.example.co would pass because indexOf returns 0.

A strict origin comparison should be used instead:

Suggested fix
  function postMessageReceived(e) {
    if (!e) { return; }
-   if (discourseUrl.indexOf(e.origin) === -1) { return; }
+   var defined = document.createElement('a');
+   defined.href = discourseUrl;
+   if (e.origin !== (defined.protocol + '//' + defined.host)) { return; }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function postMessageReceived(e) {
if (!e) { return; }
if (discourseUrl.indexOf(e.origin) === -1) { return; }
if (e.data) {
if (e.data.type === 'discourse-resize' && e.data.height) {
iframe.height = e.data.height + "px";
}
}
}
function postMessageReceived(e) {
if (!e) { return; }
var defined = document.createElement('a');
defined.href = discourseUrl;
if (e.origin !== (defined.protocol + '//' + defined.host)) { return; }
if (e.data) {
if (e.data.type === 'discourse-resize' && e.data.height) {
iframe.height = e.data.height + "px";
}
}
}
🤖 Prompt for AI Agents
In `@app/assets/javascripts/embed.js` around lines 15 - 24, The origin check in
postMessageReceived is insecure because it uses discourseUrl.indexOf(e.origin);
replace that with a strict origin comparison (e.origin === discourseUrl) or
compare against a whitelist of allowed origins, ensuring you reference the same
discourseUrl and e.origin variables; update the conditional in the
postMessageReceived function to only accept exact matches before processing
e.data and setting iframe.height.

return if user.blank?

require 'simple-rss'
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

SSRF / command injection risk: Kernel#open with user-controlled string.

open(SiteSetting.feed_polling_url) uses Kernel#open, which will execute shell commands if the string starts with |. Even though this is admin-configured, use a safer alternative. Brakeman also flagged this.

Proposed fix
-      rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
+      rss = SimpleRSS.parse URI.open(SiteSetting.feed_polling_url)
🧰 Tools
🪛 Brakeman (8.0.1)

[medium] 29-29: Model attribute used in file name
Type: File Access
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/file_access/

(File Access)

🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` at line 29, Replace the unsafe Kernel#open
call by validating and fetching the feed URL using URI and a direct HTTP client:
parse SiteSetting.feed_polling_url with URI.parse, ensure the scheme is "http"
or "https", fetch the response body with Net::HTTP (or a similar HTTP client)
and then pass that body string to SimpleRSS.parse; update the code around
SimpleRSS.parse and SiteSetting.feed_polling_url to perform the scheme
validation and HTTP fetch and add basic error handling for invalid URIs or
network failures.

Comment on lines +31 to +37
rss.items.each do |i|
url = i.link
url = i.id if url.blank? || url !~ /^https?\:\/\//

content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

i.content may be nil — NoMethodError on .scrub.

Not all RSS/Atom items have a content element. If i.content is nil, calling .scrub will raise NoMethodError. Add a nil guard or use i.content_encoded / i.description as fallbacks.

Proposed fix
       rss.items.each do |i|
         url = i.link
         url = i.id if url.blank? || url !~ /^https?\:\/\//
 
-        content = CGI.unescapeHTML(i.content.scrub)
+        raw_content = i.content || i.description
+        next if raw_content.blank?
+        content = CGI.unescapeHTML(raw_content.scrub)
         TopicEmbed.import(user, url, i.title, content)
       end
🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` around lines 31 - 37, i.content can be nil
causing a NoMethodError when calling .scrub; update the RSS item handling in the
loop so you normalize the item body before scrubbing/using it — e.g. derive a
string body by checking i.content, then fallback to i.content_encoded or
i.description (or empty string), call .to_s.scrub on that result, and pass the
cleaned string into TopicEmbed.import along with url and i.title; ensure the
nil-guarding happens before calling .scrub and reference the loop handling where
i.content is accessed and the TopicEmbed.import call.

@@ -0,0 +1,82 @@
require_dependency 'nokogiri'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

require_dependency is for Discourse's autoloaded files, not gems.

Nokogiri is an external gem and should be loaded with require, not require_dependency. require_dependency is a Rails autoloading mechanism for application code.

Proposed fix
-require_dependency 'nokogiri'
+require 'nokogiri'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
require_dependency 'nokogiri'
require 'nokogiri'
🤖 Prompt for AI Agents
In `@app/models/topic_embed.rb` at line 1, The file app/models/topic_embed.rb
currently uses require_dependency 'nokogiri' which is incorrect for loading an
external gem; replace the call to require_dependency with a plain require
'nokogiri' so Nokogiri is loaded as a gem (edit the top of topic_embed.rb where
require_dependency 'nokogiri' appears and change it to require 'nokogiri').

@@ -0,0 +1,13 @@
class CreateTopicEmbeds < ActiveRecord::Migration
def change
create_table :topic_embeds, force: true do |t|

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

force: true will silently drop the table if it already exists.

If this migration is re-run or the table already exists (e.g., from a partial deployment), force: true will destroy all existing data without warning. Consider removing it so that create_table raises an error instead, which is a safer default for production migrations.

Proposed fix
-    create_table :topic_embeds, force: true do |t|
+    create_table :topic_embeds do |t|
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
create_table :topic_embeds, force: true do |t|
create_table :topic_embeds do |t|
🤖 Prompt for AI Agents
In `@db/migrate/20131217174004_create_topic_embeds.rb` at line 3, The migration
currently calls create_table :topic_embeds with force: true which will silently
drop the table if present; remove the force: true option from the create_table
call so the migration will raise an error if the table already exists
(preserving data) and run/update the migration tests to ensure no further
references expect destructive recreation of :topic_embeds.

@@ -1,6 +1,6 @@
class CreateTopTopics < ActiveRecord::Migration
def change
create_table :top_topics do |t|
create_table :top_topics, force: true do |t|

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

force: true in a migration risks silent data loss.

force: true will DROP TABLE if top_topics already exists before recreating it. This is appropriate in schema.rb but dangerous in a migration—if the migration is replayed (e.g., db:migrate:redo) or the table was partially created in a prior attempt, all existing data is silently destroyed.

If the intent is idempotency, prefer a guard check instead:

Suggested safer alternative
-    create_table :top_topics, force: true do |t|
+    create_table :top_topics do |t|
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
create_table :top_topics, force: true do |t|
create_table :top_topics do |t|
🤖 Prompt for AI Agents
In `@db/migrate/20131223171005_create_top_topics.rb` at line 3, The migration
currently uses create_table :top_topics, force: true which will DROP the table
if it exists and can cause silent data loss; remove the force: true option from
the create_table call and instead make the migration idempotent by guarding the
creation with a check for existence (e.g., use table_exists?(:top_topics) or
Rails' create_table ... if_not_exists option where supported) or perform an
explicit, intentional drop in the down/rollback path only; update the migration
that defines create_table :top_topics to use one of these safer approaches so
retries or replays won't destroy existing data.

Comment on lines +27 to +29
if $redis.setnx(retrieved_key, "1")
$redis.expire(retrieved_key, 60)
return false

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Race condition: setnx + expire are not atomic.

If the process crashes between setnx (Line 27) and expire (Line 28), the key will persist indefinitely with no TTL, permanently blocking retrieval for that URL. Use a single atomic command instead.

Proposed fix using atomic SET with EX and NX
-      retrieved_key = "retrieved:#{`@embed_url`}"
-      if $redis.setnx(retrieved_key, "1")
-        $redis.expire(retrieved_key, 60)
-        return false
-      end
+      retrieved_key = "retrieved:#{`@embed_url`}"
+      if $redis.set(retrieved_key, "1", ex: 60, nx: true)
+        return false
+      end
🤖 Prompt for AI Agents
In `@lib/topic_retriever.rb` around lines 27 - 29, The current two-step lock
(using $redis.setnx(retrieved_key, "1") then $redis.expire(retrieved_key, 60))
is racy; replace it with a single atomic Redis SET that applies NX and EX in one
call (use the Redis client SET with nx: true and ex: 60) so the key and TTL are
created atomically; adjust the conditional to check the SET return value (OK vs
nil) instead of setnx's truthiness and remove the separate expire call,
referencing retrieved_key, $redis.setnx and $redis.expire in the change.

Comment on lines +48 to +52
it "creates a topic view when a topic_id is found" do
TopicEmbed.expects(:topic_id_for_embed).returns(123)
TopicView.expects(:new).with(123, nil, {best: 5})
get :best, embed_url: embed_url
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the EmbedController file
fd -t f "embed_controller\.rb" | grep -v spec

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 123


🏁 Script executed:

#!/bin/bash
# Search for the best method in EmbedController
rg -A 20 "def best" --type rb | grep -A 20 "embed_controller"

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 114


🏁 Script executed:

#!/bin/bash
# Also search for TopicView usage in the controller
rg "TopicView" --type rb | grep -v spec | head -20

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 114


🏁 Script executed:

cat -n app/controllers/embed_controller.rb | head -100

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 1301


🏁 Script executed:

# Search for the best method definition
rg -A 15 "def best" app/controllers/embed_controller.rb

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 466


🏁 Script executed:

# Find the associated view template
fd -t f "best" app/views/embed/ 2>/dev/null || find app/views -name "*best*" 2>/dev/null

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 117


🏁 Script executed:

cat -n app/views/embed/best.html.erb

Repository: admin-coderabbit/coderabbit-discourse-mirror

Length of output: 1379


Add explicit return value to TopicView.expects(:new) stub to ensure the template renders correctly.

The view template (app/views/embed/best.html.erb) accesses @topic_view.posts (including iteration via .each) and @topic_view.topic.url. Without an explicit return value, the Mocha mock may not properly support the enumeration and nested attribute access required by the template.

Suggested fix
topic_view = mock
TopicView.expects(:new).with(123, nil, {best: 5}).returns(topic_view)
🤖 Prompt for AI Agents
In `@spec/controllers/embed_controller_spec.rb` around lines 48 - 52, Create a
concrete mock object for TopicView and make TopicView.expects(:new).with(123,
nil, {best: 5}) return that mock so the view can call `@topic_view.posts` and
`@topic_view.topic.url`; e.g., build a topic_view mock, stub its posts to return
an array (or an enumerable) and stub topic.url (or topic to return a mock with
url) before calling get :best so the template can iterate and access nested
attributes.

Comment on lines +12 to +28
it "requires feed_polling_enabled?" do
SiteSetting.stubs(:feed_polling_enabled?).returns(false)
poller.expects(:poll_feed).never
poller.execute({})
end

it "requires feed_polling_url" do
SiteSetting.stubs(:feed_polling_url).returns(nil)
poller.expects(:poll_feed).never
poller.execute({})
end

it "requires embed_by_username" do
SiteSetting.stubs(:embed_by_username).returns(nil)
poller.expects(:poll_feed).never
poller.execute({})
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Negative tests don't isolate the condition under test — they may pass vacuously.

Each negative test only stubs one setting. If execute short-circuits on the first check (e.g., feed_polling_enabled?), then the "requires feed_polling_url" and "requires embed_by_username" tests pass because polling is disabled by default, not because the specific setting is missing.

Stub all other preconditions to valid values so each test isolates the condition it's verifying:

Proposed fix
     it "requires feed_polling_url" do
+      SiteSetting.stubs(:feed_polling_enabled?).returns(true)
       SiteSetting.stubs(:feed_polling_url).returns(nil)
+      SiteSetting.stubs(:embed_by_username).returns(embed_by_username)
       poller.expects(:poll_feed).never
       poller.execute({})
     end

     it "requires embed_by_username" do
+      SiteSetting.stubs(:feed_polling_enabled?).returns(true)
+      SiteSetting.stubs(:feed_polling_url).returns(url)
       SiteSetting.stubs(:embed_by_username).returns(nil)
       poller.expects(:poll_feed).never
       poller.execute({})
     end
🤖 Prompt for AI Agents
In `@spec/jobs/poll_feed_spec.rb` around lines 12 - 28, The negative tests are not
isolating the condition under test because other preconditions may be falsy by
default; update each `it` to stub all other required SiteSetting values to valid
values before calling `poller.execute({})` so the test fails only when the
targeted setting is missing. Specifically, in the examples referencing
`poller.execute`, ensure when testing `feed_polling_url` you stub
`SiteSetting.stubs(:feed_polling_enabled?).returns(true)` and
`SiteSetting.stubs(:embed_by_username).returns("some_user")`; when testing
`embed_by_username` stub
`SiteSetting.stubs(:feed_polling_enabled?).returns(true)` and
`SiteSetting.stubs(:feed_polling_url).returns("http://example.com")`; and in the
`requires feed_polling_enabled?` case stub the other two settings to valid
values so each `poller.expects(:poll_feed).never` assertion only depends on the
single setting under test.

Sign in to join this conversation on GitHub.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant